28 research outputs found

    A benchmark and an algorithm for detecting germline transposon insertions and measuring de novo transposon insertion frequencies

    Transposons are genomic parasites, and their new insertions can cause instability and spur the evolution of their host genomes. Rapid accumulation of short-read whole-genome sequencing data provides a great opportunity for studying new transposon insertions and their impacts on the host genome. Although many algorithms are available for detecting transposon insertions, the task remains challenging and existing tools are not designed for identifying de novo insertions. Here, we present a new benchmark fly dataset based on PacBio long-read sequencing and a new method TEMP2 for detecting germline insertions and measuring de novo 'singleton' insertion frequencies in eukaryotic genomes. TEMP2 achieves high sensitivity and precision for detecting germline insertions when compared with existing tools using both simulated data in fly and experimental data in fly and human. Furthermore, TEMP2 can accurately assess the frequencies of de novo transposon insertions even with high levels of chimeric reads in simulated datasets; such chimeric reads often occur during the construction of short-read sequencing libraries. By applying TEMP2 to published data on hybrid dysgenic flies inflicted by de-repressed P-elements, we confirmed the continuous new insertions of P-elements in dysgenic offspring before they regain piRNAs for P-element repression. TEMP2 is freely available at GitHub: https://github.com/weng-lab/TEMP2

    Metazoan tsRNAs: Biogenesis, Evolution and Regulatory Functions

    Transfer RNA-derived small RNAs (tsRNAs) are an emerging class of regulatory non-coding RNAs that play important roles in post-transcriptional regulation across a variety of biological processes. Here, we review the recent advances in tsRNA biogenesis and regulatory functions from the perspectives of functional and evolutionary genomics, with a focus on the tsRNA biology of Drosophila. We first summarize our current understanding of the biogenesis mechanisms of different categories of tsRNAs that are generated under physiological or stressed conditions. Next, we review the conservation patterns of tsRNAs in all domains of life, with an emphasis on the conservation of tsRNAs between two Drosophila species. Then, we elaborate on the currently known regulatory functions of tsRNAs in mRNA translation that are independent of, or dependent on, Argonaute (AGO) proteins. We also highlight some issues related to the fundamental biology of tsRNAs that deserve further study.

    Adaptation of A-to-I RNA editing in Drosophila

    Adenosine-to-inosine (A-to-I) editing is hypothesized to facilitate adaptive evolution by expanding proteomic diversity through an epigenetic approach. However, it is challenging to provide evidence to support this hypothesis at the whole editome level. In this study, we systematically characterized 2,114 A-to-I RNA editing sites in female and male brains of D. melanogaster, and nearly half of these sites had events evolutionarily conserved across Drosophila species. We detected strong signatures of positive selection on the nonsynonymous editing sites in Drosophila brains, and the beneficial editing sites were significantly enriched in genes related to chemical and electrical neurotransmission. The signal of adaptation was even more pronounced for the editing sites located on the X chromosome or for those commonly observed across Drosophila species. We identified a set of gene candidates (termed “PSEB” genes) that had nonsynonymous editing events favored by natural selection. We presented evidence that editing preferentially increased the mutation sequence space of evolutionarily conserved genes, which supported the adaptive evolution hypothesis of editing. We found prevalent nonsynonymous editing sites that were favored by natural selection in female and male adults from five strains of D. melanogaster. We showed that temperature played a more important role than gender effect in shaping the editing levels, although the effect of temperature is weaker than the species effect. We also explored the relevant factors that shape the selective patterns of the global editomes. Altogether we demonstrated that abundant nonsynonymous editing sites in Drosophila brains were adaptive and maintained by natural selection during evolution. Our results shed new light on the evolutionary principles and functional consequences of RNA editing.

    The landscape of A-to-I editomes in D. melanogaster.

    (A) A flowchart of A-to-I editing detection in brains of D. melanogaster. Editing sites are classified into five distinct classes based on the decreasing confidence of editing, sequencing coverage, and the number of libraries in which the editing events are detected. (B) Overlaps of the editing sites identified in this study and previous studies. (C) Boxplots of the editing levels of the common and novel sites in each brain library (P < 0.001 in each brain library, KS tests). (D) A summary of the editing sites with respect to their functional annotations. The numbers of high-confidence editing sites in each functional category in D. melanogaster are given in the top panel, and the proportion of editing sites is presented above the bars. mRNA-seq coverage (middle) and editing level (bottom) of editing sites in each category are also shown. For a site, the median value of coverage and editing level across all the libraries (if applicable) is used for the boxplots. (E) The observed numbers of editing sites located in stable hairpin structures and the expected numbers of sites (median and 95% confidence intervals) under randomness. ***, P < 0.001 revealed by simulations.

    A-to-I editing increases mutation sequence space of evolutionarily conserved genes.

    (A) The editing density in the N sites is significantly inversely correlated with the dN value (between D. melanogaster and D. simulans) of the host genes. The genes expressed in brains are ranked by increasing dN value and divided into 20 bins (the x-axis; lower dN means evolutionarily more conserved). The left and right panels show 1- to 5-day female (B2) and male (B6) brains of D. melanogaster, respectively (Table 1: http://www.plosgenetics.org/article/info:doi/10.1371/journal.pgen.1006648#pgen.1006648.t001). In each bin, the editing density (y-axis) is calculated by dividing the observed number of editing sites by the total number of adenosine sites that cause amino acid changes if edited. (B) The editing density in the N sites is significantly positively correlated with the phyloP score of the sites. All the nonsynonymous adenosine sites (cause amino acid changes if edited; ≥ 5X sequencing coverage) are ranked by increasing phyloP score and grouped into 20 bins (x-axis; higher phyloP score means evolutionarily more conserved). (C) The editing density of the N sites is significantly lower in the non-conserved compared to conserved sites after controlling for mRNA-Seq coverage. All the nonsynonymous adenosine sites (cause amino acid changes if edited; ≥ 5X) are ranked by increasing sequencing coverage and binned into 20 categories (x-axis). Within each bin, we further divided the sites into two equal-sized subgroups based on the phyloP scores. The y-axis is the editing density of the non-conserved relative to the conserved subgroup in each bin.
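The binning described in panel (A) can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `editing_density_by_bin`, the tuple layout `(dN, edited_sites, editable_sites)`, and the toy numbers are all assumptions made for illustration. It ranks genes by dN, splits them into equal-sized bins, and divides the pooled number of edited sites by the pooled number of editable nonsynonymous adenosine sites in each bin.

```python
def editing_density_by_bin(genes, n_bins=20):
    """Return one editing density per dN-ranked bin.

    genes: list of (dN, edited_sites, editable_sites) tuples.
    This layout is hypothetical; the caption does not specify one.
    """
    # Rank genes from most conserved (low dN) to least conserved (high dN).
    ranked = sorted(genes, key=lambda g: g[0])
    densities = []
    for i in range(n_bins):
        # Integer arithmetic keeps bin sizes as equal as possible.
        lo = i * len(ranked) // n_bins
        hi = (i + 1) * len(ranked) // n_bins
        chunk = ranked[lo:hi]
        edited = sum(g[1] for g in chunk)
        editable = sum(g[2] for g in chunk)
        # Density = observed edited sites / editable adenosine sites.
        densities.append(edited / editable if editable else float("nan"))
    return densities


# Toy input (made-up numbers): four genes split into two bins.
toy_genes = [(0.50, 0, 100), (0.01, 2, 100), (0.90, 1, 100), (0.02, 1, 100)]
toy_densities = editing_density_by_bin(toy_genes, n_bins=2)
```

Pooling counts within a bin, rather than averaging per-gene densities, follows the caption's wording that the observed number of sites is divided by the total number of editable sites in each bin.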

    The effect of temperature on editing levels in brains of Drosophila.

    (A) The changes of editing levels in N and silent (S and UTRs) sites in female and male brains under elevated temperature (stressed at 30°C for 48 hours). Error bars represent the s.e. of the level changes for editing sites in each category. (B) The flanking sequences (100 nt on each side) have significantly lower MFE (kcal/mol) for the N sites compared to the silent sites. (C) Clustering the brain libraries of D. melanogaster based on the editing levels of 391 high-confidence editing sites that have at least 20 raw reads in each brain library. Note that flies of the same accommodation conditions always cluster together. (D) Clustering the brain libraries of D. melanogaster and D. simulans based on the editing levels of 289 high-confidence editing sites that have at least 20 raw reads in each brain library. Note that species divergence plays a more important role than temperature in clustering the samples.

    The effect of local nucleotide contexts on editing in brains of D. melanogaster.

    (A) A 7-mer motif centered on the high-confidence editing sites. (B) The score cutoff that specifies the top 90% quantile of the high-confidence editing sites (-0.622) corresponds to the top 75.4% of all the 7-mer sequences centered on adenosine in the genes with editing events.

    Conservation patterns of editing sites in brains of D. melanogaster and two other species.

    “+”, the high-confidence editing sites were reliably detected in a species (Top). Bottom: possible gain and loss patterns of 87 sites that have a minimal editing level of 0.05 in D. melanogaster and have at least 200 raw reads in both D. simulans and D. pseudoobscura. “-”, the orthologous site is not edited with high probability [joint P(D_0) < 0.0002].

    The editing sites detected in female and male adults in five strains of D. melanogaster.
